Abstract


This report investigates the dimensionality of the 1992 NAEP mathematics
test in the context of subgroup differences. A multidimensional model is
supported by these data, with dimensions corresponding to both
content-specific and format-specific factors. The analysis approach of this paper
utilizes key grouping variables of the NAEP reports (e.g., gender, ethnicity),
but has the advantage that subgroup comparisons are done not only in a
univariate manner using one grouping variable at a time, but also using the set of
grouping variables jointly. This is carried out within a structural model with
latent variables, which relates the information on the test items to
background information via a set of factors. It is found that the different
factors relate differently to the background variables. The multidimensional
latent variable modeling also suggests a new way of reporting results with
respect to math performance in specific content areas. For content-specific
performance, the subscores are related to overall performance, considering
content-specific scores conditional on overall scores. For a given overall
score, a subgroup difference is considered with respect to a certain content
area. This conditional approach may be of value for revealing differences in
opportunity to learn or differences in curricular emphases. Conditional
differences may be viewed as "unrealized potential" for performance in a
specific content area.


Introduction 


This report examines mathematics achievement data from the National
Assessment of Educational Progress (NAEP). NAEP is a regularly
administered, Congressionally mandated assessment program for the nation
and the states. NAEP test results for grades 4, 8, and 12 are reported for
various subgroups of the U.S. school population. The most recent
mathematics report, "NAEP 1992 Mathematics Report Card for the Nation
and the States" (Mullis, Dossey, Owen, & Phillips, 1993), includes overall
mathematics proficiencies for subgroups based on region, gender, ethnicity,
type of community, parents' highest level of education, and type of school.
Proficiencies for the entire group are also reported for the specific content
areas of numbers and operations; measurement; geometry; data analysis,
statistics, and probability; and algebra and functions. Content-specific
subgroup comparisons are given in the NAEP Data Almanacs.

The aim of this report is to investigate the dimensionality of the mathematics
test. This test consists of a large number of items distributed over a number
of test forms to which students are randomly assigned. In analyzing 1990
NAEP math data, it was suggested that the math items are essentially
unidimensional with respect to content areas, with the possible exception of
geometry in grade 8 (Rock, 1991). Support for unidimensionality is usually
based on finding correlations close to unity among factors representing
various aspects of the items. Rock's analysis of content areas showed
correlations in the range 0.86 - 0.95 for grades four, eight, and twelve.
Unidimensionality was also indicated in analyses considering item format
(Carlson & Jirele, 1993). Using the 1992 data, a more detailed analysis with
respect to item format was given in Mazzeo, Yamamoto, and Kulick (1993).
The 1992 test included both short constructed-response items and extended
constructed-response items in addition to the traditional item format of
multiple-choice items. The Mazzeo et al. analysis found an important
deviation from unidimensionality only for extended constructed-response
items. In 1992, however, extended constructed-response items made up less
than 4% of the total number of items for grades 4, 8, and 12.

As mentioned above, NAEP reports subgroup differences with respect to
overall math performance, whereas content-specific performance is typically
not reported for subgroups. Given the indications of unidimensionality, one
may in fact ask if content-specific reporting is at all necessary, or if the
overall reporting is sufficient. The idea of simplified reporting has been
discussed among ETS researchers. For example, in analyzing 1990 NAEP
math data Rock (1991) concluded that "there seems to be little discriminant
validity here. In conclusion, it would seem that we are doing little damage
in using a composite score."

In our view, entertaining the notion of unidimensionality, although useful for
simplified reporting, may leave interesting features of the data unexplored.
As shown in the appendix, it is not hard to settle for unidimensionality
unless a special effort is made to find meaningful additional dimensions.
This paper argues that the need for a multidimensional representation of the
data is difficult to judge based on the conventional approach reported above
of estimating correlations in multifactorial models. This paper goes beyond
the conventional approach in two respects. First, it uses a latent variable
model that is more sensitive to capturing deviations from unidimensionality.
[...] significance of adding these further dimensions, the same subgroups
that NAEP compares are also compared using the multidimensional model.

NAEP's estimation of subgroup differences is based on a statistically
complex procedure where proficiencies are estimated based not only on
student performance, but also on background variables ("conditioning
variables"), including those used for subgroups in the reports. The
methodology of this paper utilizes the key grouping variables of the NAEP
reports (e.g., gender, ethnicity), but has the advantage that subgroup
comparisons are done not only in a univariate manner using one grouping
variable at a time, but also using the set of grouping variables jointly. This is
carried out within a structural model with latent variables, which relates
the information on the test items to background information. [...]
structural model is similar to the framework used by NAEP in [...]
proficiencies for the subgroups. The results are not, however, [...]
first estimating proficiencies using conditioning variables [...]
methodology has the further benefit of providing a [...]
procedure.

The multidimensional latent variable modeling used here also suggests a
new way of reporting results with respect to math performance in specific
content areas. For content-specific performance, we propose relating
subscores to overall performance, considering content-specific scores
conditional on overall scores. For a given overall score, a subgroup
difference is considered with respect to a certain content area. This
may show that two individuals with the same overall score but belonging to
different subgroups are expected to perform quite differently in a particular
content area. This conditional approach gives a sharper focus in the
reporting. It may be of value for revealing differences in opportunity to
learn or differences in curricular emphases. Conditional differences may be
viewed as "unrealized potential" for performance in the specific content area.

Method 

Samples 

Mathematics data from the 1992 NAEP main assessment are used (the
"Main Focused-BIB Assessment"). NAEP is a multistage probability
sample with three stages of selection: primary sampling units (PSUs)
defined by geographical areas, schools within PSUs, and students within
schools. In the 1992 NAEP main assessment 26 different test forms were
used, each taken by almost 400 students in each of grades 4, 8, and 12,
resulting in test results for almost 10,000 students per grade. The analyses
in this paper will focus on grade 8 and grade 12. Given missing data on
some of the background variables used in the present analyses, the sample
sizes are 8,963 for grade 8 and 8,705 for grade 12, corresponding to missing
data rates of 13% for grade 8 and 8% for grade 12.

Variables 

The 1992 NAEP main assessment considered test items from the five
content areas:

(1) Numbers and Operations (whole numbers, fractions, decimals,
integers, ratios, proportions, percents, etc.).

(2) Measurement (describing real-world objects using metric,
customary, and non-standard units).

(3) Geometry (geometric figures and relationships in one, two, and
three dimensions).

(4) Data Analysis, Statistics, and Probability (data
representation and interpretation).

(5) Algebra and Functions (algebra, elementary functions,
trigonometry, discrete mathematics).

There are three formats used for the 1992 math items: conventional
multiple-choice items (binary scored), short constructed-response items
(binary scored), and extended constructed-response items. The mix of
content and format for the test items of each grade is shown in Table 1. It is
seen that the grade 8 test is dominated by Numbers and Operations items,
whereas the grade 12 test has as many Algebra items. About one third of the
items are short constructed-response items, whereas less than 4% of the
items are of the extended constructed-response format.


Insert Table 1


NAEP results are presented as test scores for each of the five content areas
and an overall composite score, which is a weighted sum of the five content
areas. The determination of the weights is based on what is thought
important for students to know at a certain grade level. For grade 4, the
weights are (using the order of the five content areas given above): 45, 20,
10, 10, 10. For grade 8 they are: 30, 15, 20, 15, 20. For grade 12 they are:
25, 15, 20, 15, 25. It is seen that Numbers & Operations obtains diminishing
weight over grades, whereas Geometry and Algebra obtain increasing
weights. The weights for grades 8 and 12 correspond roughly to the item
content mix shown in Table 1.
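The weighted-sum composite can be illustrated with a minimal sketch. The weights are the grade 8 and grade 12 values quoted above, read as percentages. The content-area scores are invented for illustration and are not NAEP results, and the function name `composite_score` is hypothetical rather than part of any NAEP software.

```python
# Content-area order: Numbers & Operations, Measurement, Geometry,
# Data Analysis/Statistics/Probability, Algebra & Functions.
WEIGHTS = {
    8: (30, 15, 20, 15, 20),
    12: (25, 15, 20, 15, 25),
}


def composite_score(content_scores, grade):
    """Weighted sum of five content-area scores; weights are percentages."""
    weights = WEIGHTS[grade]
    if len(content_scores) != len(weights):
        raise ValueError("expected one score per content area")
    return sum(w * s for w, s in zip(weights, content_scores)) / sum(weights)


# Hypothetical content-area scores (illustration only, not NAEP data).
example_scores = (270.0, 265.0, 260.0, 268.0, 262.0)
grade8_composite = composite_score(example_scores, grade=8)
grade12_composite = composite_score(example_scores, grade=12)
```

With identical content-area scores the two grades give different composites only because the weights differ, which is the sense in which the composite reflects what is thought important at each grade level.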

NAEP uses a balanced incomplete block ("Focused-BIB") design to
distribute the test items across the test forms. There are 13 blocks of items.
Each of the 26 test forms ("booklets") consists of three blocks, each block
appears in six booklets, and each block appears once with every other block.
Tables 2 and 3 show this design for the twelfth and eighth grade tests, also
showing how many students took each block in the samples of students used
in the present analyses. As is seen from Table 2, this paper uses each block
of items to create a set of testlets. A testlet is a sum of binary scored items,
where omits are treated as incorrect. The testlets are specific to content area
and item format. The column labelled "Format" shows whether a testlet
consists of multiple-choice items (M) or short constructed-response items
(C). The column labelled "Content" uses the content area numbering given
above. As mentioned above, there were very few extended
constructed-response items in mathematics. Dimensionality assessment of so few
items would not be meaningful given our aggregation of items into testlets,
and extended constructed-response items are therefore excluded from the
present analyses.
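The booklet design described above (13 blocks, 26 booklets of three blocks, each block in six booklets, each pair of blocks together exactly once) can be checked with a short sketch. The construction below is one standard cyclic design with these parameters. It is an assumption made for illustration and is not claimed to reproduce the actual NAEP block-to-booklet assignment, and block position within a booklet is ignored.

```python
from collections import Counter
from itertools import combinations

N_BLOCKS = 13


def cyclic_design(base_triples, n=N_BLOCKS):
    """Develop each base triple cyclically by adding 0..n-1 modulo n."""
    return [
        tuple(sorted((x + shift) % n for x in triple))
        for triple in base_triples
        for shift in range(n)
    ]


# Assumed base triples: their pairwise differences cover 1..6 (mod 13) once each.
booklets = cyclic_design([(0, 1, 4), (0, 2, 7)])

# How often each block is used, and how often each pair of blocks co-occurs.
block_counts = Counter(block for booklet in booklets for block in booklet)
pair_counts = Counter(
    pair for booklet in booklets for pair in combinations(booklet, 2)
)

n_booklets = len(booklets)
each_block_in_six = all(block_counts[b] == 6 for b in range(N_BLOCKS))
each_pair_once = len(pair_counts) == 78 and all(
    count == 1 for count in pair_counts.values()
)
```

The 78 pairs are all unordered pairs of 13 blocks, so every block meets every other block once, which is what allows covariances between testlets in different blocks to be estimated from some booklet.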


Insert Tables 2 and 3 



The use of testlets may be criticized as drawing on arbitrary item groupings.
This is not an important issue here. Given the fact that each testlet is
specific to block, content, and format, it generally consists of [...],
i.e., all items of a certain content and format within a certain block. In this
way, there is most often only one way to aggregate the items. A few blocks,
however, afford the creation of more than one testlet per content and format
and are labelled a, b, c (see, e.g., testlets [...]). Items which share the
same stem are always put into the same testlet.


Tables 2 and 3 also show the degree to which the content areas and item
formats are covered by the testlets and the 26 independent samples of
students. For example, in Table 2, grade 12 constructed-response (C)
algebra (content area 5) is represented by three testlets in booklet 4 and is
available for 354 students in this booklet. It is seen that each testlet appears
in six booklets so that, for example, the algebra testlet 48 in grade 12 has data
for a total of 2,051 students. Generally speaking, the content and format mix
of the testlets is similar to that of the NAEP test items shown in Table 1.
Exceptions are Measurement in constructed-response format for grade [...]
and Algebra in constructed-response format for grade 8, where the items
were spread over too many blocks to be represented by testlets. Factors
corresponding to these two types of items can therefore not be identified in
the present analyses. Tables 2 and 3 will be further referred to below in
connection with the description of the modeling.

The achievement variables will be related to a set of background variables 
shown in Table 4. This set corresponds to the major subgroups used in 
NAEP reporting. It is also a key set of variables used in the conditioning 
procedure used in NAEP's estimation of proficiencies in terms of the amount 
of latent variable variance explained in the conditioning. 


Insert Table 4 


Analyses 

Multidimensional latent variable modeling 

We consider a latent variable model for the set of observed variables
corresponding to the testlets. A unidimensional model states that a single
continuous latent variable accounts for the associations among these
variables. In our analyses, we will expand on this model and allow a
specific dimension corresponding to each of the five content areas and each
of the two formats. We will call this model a GS model (general-factor,
specific-factor model). The model is a version of the classic "bi-factor"
model used in Holzinger and Swineford (1939). In this way, the
variance of a variable is accounted for by up to three different types of
systematic sources of variation. The three sources are taken to be orthogonal
as in conventional variance component estimation. The first dimension is a
general factor representing the general skill required for solving these types
of mathematics problems and may be seen as corresponding conceptually to
the "overall" math score in NAEP reports. The GS model describes specific
factors as residual testlet covariance given the general factor. Deviations
from unidimensionality can be described in terms of the variance component
for the specific factors relative to the sum of variance components for the
general and specific factors. For each variable the model adds a random
error component to the systematic components in order to capture
measurement error. Given that the testlets are computed from a small
number of items, this portion of the observed variable variance tends to be
large. Because the unreliability is accounted for, however, this does not
cause problems. This error source of variation is a direct function of how the
testlets were created and is uninteresting in the context of our investigation.
Discussions of relative size of variance components for systematic sources
refer to the reliable portion of a variable's variance. The appendix gives
a simple example of a GS model and presents some general
results. In our analyses, the general factor loadings will be allowed to be free,
while for simplicity the specific factor loadings are fixed at unity.

Three features of the GS model should be noted. First, ignoring
measurement error, the model implies highly correlated content-specific
scores when the specific factor variance components are small. In
order to compare these results with the content-factor analysis of 1990
NAEP math data by Rock (1991) as well as the correlations among the five
1992 NAEP content scores, it is of interest to also present the correlations
among the five content areas as deduced from the estimated model. As
discussed in the appendix, these are computed as the correlations among the
reliable part of the content variation, purging the observations of
measurement error. The correlations can be very high even for sizable
specific-factor variance components.
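The last point can be made concrete with a simplified sketch. It assumes that the reliable part of each content score is a positively loaded general factor plus a single orthogonal specific factor, in which case the correlation between two reliable parts reduces to the square root of the product of their general-factor shares. This simplification and the function name `implied_correlation` are assumptions for illustration; the appendix computation referred to above may differ in detail.

```python
import math


def implied_correlation(share_a, share_b):
    """Correlation between the reliable parts of two content scores.

    share_a and share_b are the specific-factor proportions of reliable
    variance. With orthogonal factors the covariance comes only from the
    general factor, so the correlation is sqrt((1 - share_a) * (1 - share_b)).
    """
    for share in (share_a, share_b):
        if not 0.0 <= share < 1.0:
            raise ValueError("shares must lie in [0, 1)")
    return math.sqrt((1.0 - share_a) * (1.0 - share_b))


# Illustrative shares, not estimates from the report.
r_ten_percent = implied_correlation(0.10, 0.10)
r_twenty_percent = implied_correlation(0.20, 0.20)
```

Under these assumptions a specific-factor share of 10% in both content areas still implies a correlation of about 0.90, and a 20% share implies about 0.80, so a high correlation alone does not rule out a substantively meaningful specific dimension.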

Second, the GS model emphasizes that the content-specific scores contain 
both general factor variation and specific factor variation (cf. Schmid & 
Leiman, 1957). If the GS model is not used, but subgroup differences are 
considered with respect to content-specific observed scores, differences in 
the underlying dimensions may be obscured. Subgroups may differ in
different ways with respect to the different dimensions of variation. For 
example, one subgroup may have a slightly higher general factor mean than 
another subgroup, but a much lower specific factor mean. Given that the 
general factor dominates the variation in the observed scores, the observed 
score mean difference may turn out to be zero, concealing the large specific- 
factor difference. 
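The masking described above is simple arithmetic, sketched here with invented numbers. The sketch assumes an observed-score mean equal to the general-factor loading times the general-factor mean plus the specific-factor mean (specific loading fixed at unity); the loading and mean differences are hypothetical.

```python
def observed_mean_difference(general_loading, general_diff, specific_diff):
    """Observed-score mean difference implied by the two factor mean differences."""
    return general_loading * general_diff + specific_diff


# Hypothetical: a slight general-factor advantage (0.1) carried by a large
# loading (2.0) exactly offsets a larger specific-factor deficit (-0.2).
masked_difference = observed_mean_difference(2.0, 0.1, -0.2)
```

Here the observed mean difference is zero although the specific-factor means differ by 0.2, twice the general-factor difference.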

Third, the GS model lends itself to viewing observed scores graphically, 
separating the general factor mean differences from specific factor mean 
differences. The idea is to give information corresponding to that of 
differential item functioning ("item bias"): for a given general "trait" value
on the horizontal axis, the vertical axis shows subgroup differences for a
specific content area. In line with regression, a conditional expectation
function may be plotted for a testlet score, or its reliable part, given the
general factor. When the specific factor is orthogonal to the general factor it
may be seen as a residual. This residual has different expectation in
different subgroups. When the specific factor is correlated with the general
factor, as in the full model described in the next section, the mean of the
specific factor conditional on the general factor is a function of the general
factor. Assuming a low specific-factor, general-factor correlation and a low
specific-factor to general-factor variance ratio, the variation in this mean
across general factor values is, however, likely to be small (e.g., if a
bivariate normal distribution is assumed for the general and specific factor).
In this way, considering the conditional expectation function for different
subgroups, the same slope (or approximately the same slope) but different
intercepts are obtained. The intercept difference is of great substantive
interest because it shows how differently two individuals with the same
overall score but belonging to different subgroups are expected to perform in
a particular content or format area. Because the general factor score
represents general math skills needed to do well on the overall test, such
differences may represent "unrealized potential" (UP) due to lack of
opportunity-to-learn. Figure 1 shows this idea graphically for two groups,
A and B, where group B shows a large UP value relative to the
general factor (or overall) difference.
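The bivariate normal case mentioned above can be sketched as follows. For jointly normal general factor g and specific factor s, the conditional mean of s given g is linear in g with slope equal to the correlation times the ratio of standard deviations. All numerical values and the helper names are hypothetical, and the two groups are assumed to share the same variances and correlation.

```python
import math


def conditional_specific_mean(g, mean_g, mean_s, var_g, var_s, corr):
    """E[s | g] for a bivariate normal general factor g and specific factor s."""
    slope = corr * math.sqrt(var_s / var_g)
    return mean_s + slope * (g - mean_g)


# Hypothetical values: low correlation and low specific-to-general variance ratio.
VAR_G, VAR_S, CORR = 1.0, 0.1, 0.2
slope = CORR * math.sqrt(VAR_S / VAR_G)


def group_a(g):
    # Group A: general-factor mean 0.0, specific-factor mean 0.0.
    return conditional_specific_mean(g, 0.0, 0.0, VAR_G, VAR_S, CORR)


def group_b(g):
    # Group B: general-factor mean -0.3, specific-factor mean -0.25.
    return conditional_specific_mean(g, -0.3, -0.25, VAR_G, VAR_S, CORR)


# The gap between the two lines is the same at any general-factor value.
gap_low = group_a(-1.0) - group_b(-1.0)
gap_high = group_a(1.0) - group_b(1.0)
```

With these values the slope is about 0.06, so the two lines are nearly flat and parallel, and the constant gap between them is the quantity read as "unrealized potential".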


Insert Figure 1


The NAEP data structure provides an important complication in the
modeling. This complication is shown in Tables 2 and 3 given above. Each
booklet corresponds to an independent sample of students, so there are
26 independent groups of observations. While there is a total of 49 distinct
observed variables (testlets) in grade 12 and 51 in grade 8, for any given
group of students only a few of these variables are observed. In this way,
the data show an intricate missing data pattern. Theory for structural
equation modeling with missing data patterns of this type has been discussed
in Muthén, Kaplan, and Hollis (1987). The solution is a multiple-group
analysis where the 26 groups of students are analyzed jointly. Because each
observed variable occurs in six of the groups, equalities of parameters
involving common variables are applied across groups. Given that the GS
model detects specific factors as residual testlet covariance given the general
factor, the modeling is dependent on having at least two, and preferably
more, testlets per content- and format-specific factor. To have a large
enough sample to support stable estimation of specific factors, this testlet
requirement should hold for at least two booklets. Tables 2 and 3 show that
these minimum requirements are fulfilled (for multiple-choice testlets there
are always more than two such testlets).

With five content areas and two item formats, ten specific factors can in 
principle be included in the GS model. To better define the general factor, 
however, the content area of Numbers & Operations in multiple-choice 
format will not be represented by a specific factor. These types of items 
represent central math topics tested in a conventional way. In this way, the 
general factor is the only factor that influences such testlets and the general
factor is therefore defined in terms of performance on these traditional types
of items. Alternative specifications which include a specific factor for these 
types of items show that the results are not sensitive to this choice of 
"rotation" of the general factor. 


A Structural Model for Relating Achievement to Background (MIMIC
Modeling)

The multidimensional latent variable model described above will be
incorporated in a structural equation model which relates the factors to the
set of background variables. This type of analysis is often referred to as
MIMIC (multiple-indicators, multiple-causes) modeling in structural
equation language. For applications to the study of group differences, see,
e.g., Muthén (1989). The multidimensional model for the achievement
variables provides the measurement part of the structural model. In this part,
the estimates of key interest are the percentages of the reliable variance in
the observed variables that are due to the specific factors. As mentioned
above, these values will be interpreted as the amount of deviation from
unidimensionality. The linear regression equations relating the factors to the
background variables provide a way to describe mean differences in the
factors with respect to the groupings represented by the background
variables in a way analogous to dummy variable regression. The
model is shown in path diagram form in Figure 2 using two background
variables.
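The verbal description above can be written compactly. The notation is assumed for illustration and is not taken from the report: $y_j$ is testlet $j$, $g$ the general factor, $s_{k(j)}$ the content- and format-specific factor to which testlet $j$ belongs (omitted for multiple-choice Numbers & Operations testlets), $x$ the vector of background variables, and $\psi_g$, $\psi_k$ the residual variances of $\zeta_g$ and $\zeta_k$. The specific-factor loading is fixed at unity, as stated earlier, and the share expression additionally assumes uncorrelated factor residuals.

```latex
\begin{align}
  y_j &= \nu_j + \lambda_j\, g + s_{k(j)} + \varepsilon_j
      && \text{(measurement part)} \\
  g   &= \gamma_g^{\top} x + \zeta_g, \qquad
  s_k  = \gamma_k^{\top} x + \zeta_k
      && \text{(structural part)} \\
  \text{specific share of reliable variance for } y_j
      &= \frac{\psi_k}{\lambda_j^{2}\,\psi_g + \psi_k}
\end{align}
```

For a dummy-coded background variable, the corresponding element of $\gamma_g$ or $\gamma_k$ is the expected difference in the factor between the two categories with the other background variables held constant.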


Insert Figure 2 


The structural regression coefficients of the MIMIC model are interpreted
just as ordinary partial regression coefficients. They are presented in
standardized form, except for dummy background variables where the
coefficients will represent the expected standard deviation change in the
factor when the dummy variable changes from one category to the other
(e.g., from male to female). In these MIMIC analyses, the achievement
variables will be treated as continuous, normally distributed variables despite
their small numbers of scale steps and possible non-normality. Experience
has shown that the estimates are rather robust to such deviations from
normality. In order to decide on the number of factors that are important in
the MIMIC modeling, initial factor analyses were performed on the
achievement variables alone. Specific factors contributing less than 5% to
the reliable variance were dropped before turning to MIMIC analysis. The
MIMIC analyses were carried out in the LISCOMP computer program
(Muthén, 1987).

Subgroup Means Estimated from the MIMIC model 

The MIMIC model shows the influence of background variables on the
factors as partial regression coefficients. It is also of interest to use the
estimated model to compute estimated means for the achievement variables.
In this way, mean differences in observed variables can be studied for
subgroups corresponding to key NAEP reporting variables, such as gender
and ethnicity, providing a more direct comparison between the two ways of
describing the data.


The subgroup mean differences will be displayed graphically in line with
Figure 1. Each graph corresponds to two subgroups to be compared, e.g.,
males and females. On the horizontal axis the estimated mean and variance
for each of the two subgroups are used to plot an estimated distribution of
general factor values, using normal approximations. The estimated means
and variances are computed from the estimated model using the sample
values for the background variables. The vertical axis refers to a specific
content area and the graph displays the estimated regression lines of the
content area score on the general factor, one line for each of the two
subgroups. The two lines are determined by average parameter estimate
values across the variables representing the content area. For simplicity, it is
assumed that general and specific factors are uncorrelated. In this case the
two lines are parallel and their slope shows the influence of the general
factor on the specific content area scores, while the intercept difference
shows a content area's estimated mean difference between the two
subgroups, conditional on the general factor. This is the same as the
estimated content-specific factor mean difference between the two
subgroups. As discussed above, this difference is of primary interest
because it shows the extent to which individuals in different subgroups differ
in performance in a given content area despite having the same overall
(general factor) score. The results will be presented in the scale of standard
deviations of the reliable portions of the observed variables. This standard
deviation is obtained from the conditional variance
given the background variables as estimated by the MIMIC model. Graphs
will only be shown if "practically significant" deviations from
unidimensionality are present, that is, if the intercept difference is significant
and exceeds 0.2 of this standard deviation, corresponding to a "small effect
size" in ANOVA terms (a medium effect size is 0.5, and a large effect size
0.8).
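The effect-size benchmarks quoted above can be turned into a small labelling helper. The function name and the treatment of values falling exactly on a threshold are assumptions for illustration; only the three cut points 0.2, 0.5, and 0.8 come from the text.

```python
def effect_size_label(difference_in_sd):
    """Label an intercept difference given in standard deviation units."""
    size = abs(difference_in_sd)
    # Assumption: a value exactly at a cut point takes the higher label.
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


labels = [effect_size_label(d) for d in (0.1, 0.3, -0.6, 0.9)]
```

The sign is ignored, so a difference of -0.6 standard deviations is labelled the same way as a difference of 0.6.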


Results


The results of these analyses will be reported in three steps. First, the
percentage variance contributed by the specific factors will be presented. 
Second, the structural regression coefficients will be given. Third, graphs 
for estimated subgroup means will be presented for content- and format- 
specific sets of items conditional on the general factor. 

Results for the Measurement Part 

The estimates for the measurement part of the structural (MIMIC) modeling
will be described first. The percentages of specific factor variances are
given in Table 5 below. It is seen that statistically significant deviations
from unidimensionality are obtained with respect to three specific factors for
grade 12 and four specific factors for grade 8. The percentages for these
specific factors are in some cases sizable, ranging from 5% to 26% of the reliable
portion of the observed variable (testlet) variation. For grade 12, the largest
contributions are obtained for Data Analysis & Statistics in
constructed-response format, Algebra in multiple-choice format, and Data Analysis &
Statistics in multiple-choice format. For grade 8, the largest percentages of
specific factor variance contributions are obtained for Geometry in
constructed-response format, Geometry in multiple-choice format, and
Measurement in multiple-choice format.


Insert Table 5 


In order to compare these results with the content-factor analysis of 1990
NAEP math data by Rock (1991) and correlations among the NAEP scores
for content areas, it is of interest to also present the correlations among the
five content areas as deduced from the model (see appendix). These are
given in Table 6. The correlations are somewhat higher than the values
obtained in the Rock analysis for the 1990 test and are in line with the
hypothetical examples shown at the end of the appendix. It is noteworthy
that even with such high correlations, differential subgroup differences can
be found for the different factors, as seen in the next section.


Insert Table 6 


Results for the Structural Regressions (MIMIC Model)

Table 7 shows the grade 12 estimated coefficients for the set of regressions
of the factors on the background variables. Many of the background
variables show significant partial effects on several factors. The amount of
variance (R2) in each factor explained by the background variables is shown
at the bottom of the table. The variation in the general factor is reasonably
well explained by the background variables as indicated by the R2 value of
49%.


Insert Table 7 


It is interesting to compare the estimates in the general factor column with 
the 1992 NAEP report for overall proficiency. While the Table 7 MIMIC 
model refers to partial effects of a background variable given other 
background variables, the NAEP report refers to marginal effects for one 
background variable at a time. The marginal effect for a background 
variable is the result of interactions of this variable with other background 
variables and is not easily interpreted. Following are three Table 7 examples 
of differences in the outcomes of these two ways of reporting. For gender, 
the MIMIC model shows a significantly lower value for females given other 
background, while the NAEP report does not show a significant gender 
effect. It is not clear how the significant gender effect turns insignificant
marginally. For Asian ethnicity, the reverse holds: the MIMIC model does
not show a significant partial effect compared to Whites while the NAEP
report shows a significant marginal effect. In this case, the interpretation
may be that more Asians than Whites take advanced math courses, reducing
the Asian effect when controlling for such course taking in the MIMIC
model. In fact, while about the same percentage of Asians and Whites take
second- or third-year Algebra (55%) and Geometry (57%), 16% of Asians
take Calculus courses as compared to 5% of Whites, and [...] take
Trigonometry as compared to 19% of Whites. Finally, for school type, the
MIMIC model shows a significant negative partial effect comparing
Catholic schools to Public Schools, while the NAEP report shows a
significant positive marginal effect. The estimates from the MIMIC model
can also be used to describe marginal effects as described in the method
section. For example, the MIMIC-estimated marginal effect of Catholic
schools versus Public Schools is clearly positive, as in the NAEP report.
This rough correspondence between the two approaches should hold for all
background variables.
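How a negative partial effect can coexist with a positive marginal effect is sketched below for a linear model without interactions. In that case the marginal group difference equals the partial effect plus each other coefficient times the group difference in the corresponding background variable. All numbers are invented and are not taken from Table 7, and the function name `marginal_difference` is hypothetical.

```python
def marginal_difference(partial_effect, other_effects, other_mean_gaps):
    """Marginal group difference implied by a linear model without interactions.

    other_effects: partial coefficients of the remaining background variables.
    other_mean_gaps: group differences in the means of those variables.
    """
    if len(other_effects) != len(other_mean_gaps):
        raise ValueError("one mean gap is needed per coefficient")
    indirect = sum(b * gap for b, gap in zip(other_effects, other_mean_gaps))
    return partial_effect + indirect


# Hypothetical: a partial school-type effect of -0.10, an advanced-course
# effect of 0.80, and a 0.30 higher share of advanced-course takers.
marginal = marginal_difference(-0.10, [0.80], [0.30])
```

In this made-up case the marginal difference is 0.14 even though the partial effect is negative, because the group with the negative partial effect takes the advanced course more often.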

The specific-factor columns of Table 7 have a more complex interpretation
because these factors refer to performance on content- and format-specific
test items controlling for overall test performance (general factor value). A
content- and format-specific factor may be seen as residual variation which
describes a skill that goes beyond the general math test-taking skill. Such
factors may correspond to content- and format-specific learning of new
topics involving definitions, new concepts, and new procedures, and high
values may correlate with high degrees of opportunity-to-learn for such
specific topics. The specific factors M-Geom and C-Geom may be seen as
validated by the strong specific-factor effects from Geometry and
Trigonometry course taking as compared to not taking such courses. The
specific factor M-Algebra may be seen as validated by the strong specific-
factor effect from Calculus course taking. It is true that the students taking
such advanced courses are on the whole more able at math, reflecting a
selection phenomenon. The selection effect is, however, largely accounted
for by the strong general factor effects seen for these course-taking
categories, and the specific-factor effects describe differences beyond such a
general advantage.

The estimates in the M-Algebra specific-factor column for the Ethnicity
background variables are noteworthy. They indicate that Blacks, Hispanics,
and Asians all have significantly higher M-Algebra values than the reference
group of Whites (see also the Geometry columns for similar results). While
Asians are significantly ahead on the specific M-Algebra factor, they are not
significantly ahead of Whites on the general factor, other background
variables held constant. This is an example of the multidimensional factor
model being able to point to components of subgroup differences that are
overlooked in terms of overall performance. The specific-factor finding is
perhaps due to differences in opportunity-to-learn as a function of different
course-taking choices. This Asian-White analysis result is relatively easy to
describe. For Blacks and Hispanics, however, the M-Algebra advantage, i.e.,
the White disadvantage, is at first puzzling given their strong general-factor
disadvantage relative to Whites. This can be understood by describing the
situation as the White advantage on the general factor not leading to a fully
comparable M-Algebra performance advantage, so that the model needs to
moderate the White general-factor advantage by a lesser M-Algebra effect
for Whites than for Blacks and Hispanics. This type of reasoning may
explain the two negative effects in the M-Data column for Alg-Calc course
taking.

The possibility of differential effects of background on the different factors
is an interesting feature of the multidimensional MIMIC model which makes
for a richer representation of the data. Examples of differential and even
opposite effects are found with respect to both content and format. For
example, the partial effect of being female is significantly negative for
the general factor, while significantly positive for the Algebra-specific factor
in multiple-choice format and the Data Analysis-specific factor in
constructed-response format. The partial effect of being Asian is
small and insignificant for the general factor but large for the M-Geom and
M-Algebra factors. In terms of format differences, Data Analysis &
Statistics shows format differences for Females and for Blacks in that
performance in these groups is better on constructed-response items than on
multiple-choice items.

Table 8 shows the corresponding grade 8 MIMIC model estimates. In terms
of differential effects of background on the factors, it is interesting to
consider the background variable Gender. We find that with other
background variables held constant, females are significantly higher than
males on the general factor, but significantly lower on the Measurement-
specific factor (in multiple-choice format). Geometry shows different
relationships for the constructed-response format than for the multiple-
choice format for females and for Blacks: here, females do better on the
constructed-response format and Blacks do better on the multiple-choice
format. It is also interesting to note that, as compared to grade 12, the
Asian-White difference for Geometry has not yet developed. It should be
noted, however, that the amount of variance explained in the specific factors
is very low for grade 8.


Insert Table 8


Results for Subgroup Means 
Estimated from the MIMIC model 

The following graphs show the estimates derived from the MIMIC model for 
subgroup mean differences in a given content area conditional on the general 
factor value. To limit space, only results for gender and ethnicity will be 
presented. As stated in the methods section, graphs are only presented if 
"practically significant" deviations from unidimensionality are present, 
requiring specific factor mean differences that are significant and at least 0.2 
of a standard deviation of the reliable variation in the observed scores. 

Gender comparisons 

Grade 12 gender comparisons show no practically significant deviations
from unidimensionality for any of the specific factors. Figure 3 shows a
grade 8 gender comparison for the Measurement-specific factor in multiple-
choice format. As shown in Table 5, this specific factor contributed
approximately 20% of the reliable variation in the Measurement content area
scores. The MIMIC results of Table 8 indicated that the partial effect of
being female was positive, although rather small. The general factor
distributions of Figure 3 also show that the marginal effect of being female
is slightly positive. These results are in line with the 1992 NAEP report
(Mullis et al., 1992) for the overall math score, viewing the overall math
score in NAEP as a proxy for the general factor score. Conditional on the
general factor score, however, males are ahead of females in Measurement
performance. Had we not conditioned on the general factor, this gender
difference in Measurement performance might not have been uncovered
because the general factor dominates as a source of variation in the
Measurement performance. The NAEP Data Almanac for 1992 math
reflects this in that the gender mean difference is not significant and is only
about 0.1 of a standard deviation. This female Measurement disadvantage
may be seen as "unrealized potential" among females. While females do as
well as males on the overall test, they fall behind in this particular area. It
may be noted that the gender effect for Geometry is smaller than for
Measurement (about 0.13 of a standard deviation as opposed to about 0.20).


Insert Figure 3 


Figures 4 and 5 show the effects of different item formats. These figures
compare male and female grade 12 performance on Data Analysis &
Statistics, showing that in comparison to males, the constructed-response
format suits females better than the multiple-choice format. While neither
graph shows a large specific-factor difference, the reversal from a male
advantage in Figure 4 (multiple choice) to a female advantage in Figure 5
still makes these two figures noteworthy.


Insert Figures 4 and 5


Ethnicity comparisons 

Figures 6 and 7 show grade 12 Asian-White comparisons for Geometry
(multiple-choice) and Algebra (multiple-choice). In both cases, Asians are
ahead of Whites on the general factor and, conditional on the general factor,
further ahead on Geometry and Algebra in multiple-choice form. The
general factor difference in these two cases is rather small, less than 0.2 of a
standard deviation. In contrast, the multidimensional MIMIC model is able
to show that there are strong Asian-White differences with respect to
specific Geometry and Algebra content and format, almost 0.4 and 0.6 of a
standard deviation, respectively. As discussed in connection with Table 7,
these differences may have to do with Asians taking more advanced courses
than Whites. These differences may not show up as strongly in the observed
scores because the specific factors only account for 12 and 16%, respectively,
of the reliable variances (see Table 5), the remainder corresponding to the
dominant general factor variance. In this connection it is interesting to note
what this finding says about the influence of test content on subgroup
differences: had the 12th grade math test had more Geometry and Algebra
content, the overall Asian-White difference would have been larger.


Insert Figures 6 and 7 


Figure 8 shows a grade 12 Black-White comparison for Data Analysis &
Statistics (multiple-choice) indicating a conditional advantage for Whites. It
is noteworthy that despite such a strong White advantage for the general
factor, this cannot fully explain the White advantage on these types of items.
The specific-factor difference may have to do with lack of opportunity-to-
learn for Blacks as compared to Whites for Data Analysis & Statistics type
items.


Insert Figure 8 


Figure 9 shows a grade 12 Black-White comparison for Algebra in multiple-
choice format indicating a reversal in the comparisons of the two subgroups
for the general versus the specific factors. The Black specific-factor
advantage was mentioned in connection with the Table 7 results. The
general-factor advantage is not realized for these types of items. Perhaps
this is due to there being only a small degree of overlap in the two general-
factor distributions, so that the data supporting the two lines come mostly
from high-performing Blacks and low-performing Whites.


Insert Figure 9


Figures 10, 11, and 12 show grade 12 Black-Asian comparisons for Geometry
(both formats) and Algebra (multiple-choice). In all cases, there is a
specific-factor advantage for Asians which goes beyond the Asian general-
factor advantage. Again, given that the subgroup differences pertain to more
advanced topics, these advantages may have to do with opportunity-to-learn
differences.


Insert Figures 10-12


Figures 13 and 14 show grade 12 Hispanic-Black comparisons indicating a
conditional Hispanic advantage for Data Analysis & Statistics (multiple-
choice) and Geometry (constructed-response). The specific-factor difference
is in both cases larger than the general-factor difference. One may note that
the Data Analysis & Statistics finding is analogous to the White-Black
comparison of Figure 8.


Insert Figures 13, 14 


Figures 15, 16, and 17 show grade 12 Hispanic-Asian comparisons. Figures 15
and 16 indicate a conditional Asian advantage for Geometry (multiple-
choice) and Algebra (multiple-choice) as was the case in the White-Asian
comparisons. Figure 17 shows an Asian disadvantage for Data Analysis &
Statistics (multiple-choice) despite an Asian advantage for the general factor.
The interpretation of Figure 17 may be similar to that of Figure 5 in that the
data supporting the two lines come mostly from high-performing Hispanics
and low-performing Asians.


Insert Figures 15-17


Figures 18 and 19 show grade 8 Asian-White comparisons. Figure 18 shows
that for the Measurement-specific factor in multiple-choice format there is a
reversal in the effects for the general and the specific factors: Asians are
ahead of Whites on the general factor, but Whites have conditionally higher
values on Measurement. Figure 19 shows that for Data Analysis & Statistics
in multiple-choice format an analogous reversal is seen. The NAEP Data
Almanac shows that Asians obtain higher means in both content areas, but
that the mean differences are insignificant.


Insert Figures 18, 19


Discussion 

This paper has found multidimensionality in the 1992 NAEP math items.
This has an impact on the description of subgroup differences. In several
cases, the multidimensional description of subgroup differences was
able to identify subgroup differences in content- and format-specific factors
which were different from overall subgroup differences. This type of
description indicates that the finding of highly correlated content-specific
subscores does not necessarily suggest reporting only subgroup differences
with respect to an overall score, but that reporting of conditional, content-
specific scores may be used.

Studying subgroup differences with respect to specific factors may lead to a
more "instructionally sensitive" way to analyze achievement data. Take, for
example, the Asian-White difference with respect to Algebra shown in
Figure 7. The specific-factor difference is almost 0.6 of a standard deviation
(of the reliable part of the Algebra score) while the general factor difference
is less than 0.2 of this standard deviation. The fact that Asian and White
individuals with the same general factor value can differ this much with
respect to what is specific to algebra raises the possibility of "unrealized
potential" of the White student subgroup relative to the Asian subgroup.
Another example is provided by the Figure 13 Hispanic-Black comparison
for grade 12 Data Analysis & Statistics, suggesting that Blacks have
unrealized potential relative to Hispanics. Such differences can reveal
important educational process differences related to curricular emphases,
differences in opportunity-to-learn, and the effects of differential course
choices. It would be of interest to attempt to study such differences over
time and to explain how they arise. As examples of other such specific-
factor differences worthy of further investigation one may also mention the
Male-Female difference with respect to Measurement, the Asian-White
difference with respect to Geometry, and the Black-White difference with
respect to Data Analysis & Statistics. To understand these differences,
however, it is likely that a much richer set of explanatory background
variables is needed than was used here.

The differential subgroup differences for the different factor dimensions also
clearly show how dependent subgroup differences are on the particular mix
of content and format that is used for the test items. For example, in
comparison to males, females appear to do relatively better on constructed-
response items than multiple-choice items for Data Analysis & Statistics in
grade 12 and Geometry in grade 8. This has implications for future
developments of NAEP testing and the comparison of performance over
time. One can expect a trend towards using more constructed-response
items, reducing the reliance on the multiple-choice format. The particular
content mix and the content weights may also change over time.

The 1992 math findings reported here replicate in some respects analyses of
the 1990 NAEP math data (Muthén, 1991). In both cases, a MIMIC
approach was taken, but analysis procedures were different in three regards.
Due to the different BIB spiraling structures, the two data sets give rise to
different ways of creating testlets. The 1990 data made it possible to analyze
a set of testlets in seven replicate analyses of seven booklets, while in 1992
the analysis needed to be done simultaneously on all the 26 booklets. In the
1990 analyses no Asian-White or Black-Hispanic comparisons were made
and no format-specific testlets or factors were formulated. Despite these
differences, it is interesting to note that the 1992 grade 8 conditional
Measurement disadvantage for females was also observed in analyses of
1990 NAEP math data. Furthermore, the 1992 grade 12 Black-White
comparison for Data Analysis & Statistics indicating a conditional advantage
for Whites was also observed in analyses of the 1990 NAEP math data.

The latent variable technique used in this report provides a general
methodology for data structures of the NAEP type. It gives flexibility for
the researcher in that NAEP items and background variables are used
without having to rely on the particular proficiency scores that are generated
for NAEP reports. Conditioning variables are not used to generate scores.
Such background variables can instead be incorporated in the analysis as
done in the MIMIC model. This approach therefore provides a way to
validate findings from regression analyses based on NAEP proficiency

Appendix


When data are generated by a single dominant dimension and additional minor
dimensions, it is easy to settle for unidimensionality unless a special effort is
made to find the additional dimensions. The following latent variable model
is a useful tool for detecting such deviations from unidimensionality. The
model is a classic "bi-factor" model (see, e.g., Holzinger & ...)
with one general factor and one specific factor for each observed variable.
In the classic case, the specific factors are uncorrelated among themselves
and with the general factor. This latent variable model will be referred to as
a GS model (general-factor, specific-factor model). This model will be
modified here to include covariates of the general and specific factors, in
which case all factors can be correlated as a function of their common
dependence on the covariates. This modified GS model is the MIMIC
model (multiple-indicators, multiple-causes model) used in the analyses of
the paper. The modified GS model is a good vehicle for illustrating how
multidimensional models may be mistaken for unidimensional models.


Consider the following GS model for ten observed variables y,

(1)
$y_i = G + e_i, \quad i = 1, \ldots, 6$
$y_i = G + S + e_i, \quad i = 7, \ldots, 10$

where G and S are the general and specific factors, respectively, and the e's
represent measurement errors. For simplicity the above GS model has unit
loadings everywhere. Consider next the structural regressions of the factors
on a covariate x,

(2)
$G = b_g x + r_g$
$S = b_s x + r_s$

where the b's are regression coefficients and the r's are residuals. While the
residuals are uncorrelated so that G and S are uncorrelated given x, the
marginal correlation between G and S is not zero. The point of involving a
covariate x is the following. Using information on the y's alone, the
correlation between G and S can only be identified under very restrictive
specifications such as using fixed loadings. Adding information on x's,
however, makes it possible to identify the structural regression coefficients
and thereby allows G and S to correlate as a function of their common
dependence on x. In such a model, the residual correlation for G and S is
zero and no restrictive specifications are needed for the loadings. This
appendix considers what happens in the conventional approach of analyzing
only the y's and incorrectly applying a one-factor model when a modified GS
model is the true model.
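
The statement that G and S correlate marginally even though their residuals do not can be written out. The short derivation below is a sketch that starts from (2) and adds two assumptions not stated explicitly in the text: the residuals are uncorrelated with x, and V(x) denotes the variance of x.

```latex
\begin{align*}
\mathrm{Cov}(G, S) &= \mathrm{Cov}(b_g x + r_g,\; b_s x + r_s) = b_g\, b_s\, V(x), \\
V(G) &= b_g^2\, V(x) + V(r_g), \qquad
V(S) = b_s^2\, V(x) + V(r_s).
\end{align*}
```

Under these assumptions the marginal covariance is nonzero whenever both regression coefficients are nonzero, and its sign is the sign of the product of the two coefficients.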

Assume for example that the first six y variables correspond to NAEP's
Numbers & Operations items and the last four y variables correspond to
Algebra items. Or, alternatively, that the first six y variables correspond to
multiple-choice items for a certain content area and the last four y variables
correspond to constructed-response items for the same content area. Using
the first example, S corresponds to algebra-specific skills that go beyond
Numbers & Operations skills needed to solve the algebra items represented
by y variables 7-10. A useful index of the degree to which the model
deviates from unidimensionality is the specific factor variance ratio

(3)  $\dfrac{V(S)}{V(G) + V(S) + 2\,\mathrm{Cov}(G, S)}$,

where the covariance is zero in the classic GS model but possibly nonzero in
the modified GS model with covariates. This ratio does not involve the
variable-specific amount of measurement error variance. The proportion
residual variance, or unreliability, in a y variable depends on the number of
items used to form the testlets. It is advantageous that the ratio does not
depend on this arbitrary choice. Here, reliability is defined as

(4)  $\dfrac{V(G) + V(S) + 2\,\mathrm{Cov}(G, S)}{V(G) + V(S) + 2\,\mathrm{Cov}(G, S) + V(e)}$,

where for y variables 1-6 the terms V(S) and Cov(G, S) disappear.

The reliable part of the variation in the six Numbers & Operations variables
is G and the reliable part of the four algebra variables is G + S. The
correlation between these two reliable parts is

(5)  $\dfrac{V(G) + \mathrm{Cov}(G, S)}{\sqrt{V(G)}\,\sqrt{V(G) + V(S) + 2\,\mathrm{Cov}(G, S)}}$.

In contrast to this correlation, the correlation between the Numbers &
Operations y variables and the Algebra y variables is attenuated because the
measurement error variances add to the denominator of the expression
above. The amount of attenuation depends on the reliability of the variables,
which again depends on the number of items used to form the testlets.

The correlation given in (5) has further meaning. It is also the correlation
that is obtained between the two factors of a two-factor, simple-structure
confirmatory factor analysis model with correlated factors fitted to the y
variables of the modified GS model. This is easily seen from (1) if factor 1
is defined as G and factor 2 is defined as G + S, letting variables 1-6 load on
factor 1 and 7-10 load on factor 2. The fact that a correlated, two-factor
model fits the GS model perfectly relates to hierarchical factor analysis
transformations discussed in Schmid and Leiman (1957).
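
The three indices defined in (3), (4), and (5) are simple functions of the factor variances, the G-S correlation, and the error variance. The following is a minimal sketch of that arithmetic, not the software used for the report; the function name `gs_summary` and its argument names are illustrative assumptions. The inputs shown are one parameter combination of the kind listed in Table A.1 (V(G) = 0.80, V(S) = 0.20, G-S correlation 0.2, and error variance 0.40 for variables 7-10).

```python
import math


def gs_summary(v_g, v_s, r_gs, v_e):
    """Return (variance ratio (3), reliability (4), two-factor correlation (5)).

    v_g, v_s : variances of the general and specific factors
    r_gs     : correlation between G and S
    v_e      : measurement error variance of a y variable that loads on G and S
    """
    cov = r_gs * math.sqrt(v_g * v_s)      # Cov(G, S) implied by the correlation
    v_reliable = v_g + v_s + 2.0 * cov     # variance of the reliable part G + S
    ratio = v_s / v_reliable               # equation (3)
    reliability = v_reliable / (v_reliable + v_e)               # equation (4)
    corr = (v_g + cov) / (math.sqrt(v_g) * math.sqrt(v_reliable))  # equation (5)
    return ratio, reliability, corr


ratio, rel, corr = gs_summary(v_g=0.80, v_s=0.20, r_gs=0.2, v_e=0.40)
print(round(ratio, 2), round(rel, 2), round(corr, 2))  # 0.17 0.74 0.91
```

Note that the ratio printed here includes the covariance term of (3), so it is somewhat smaller than the value 0.20 obtained when the G-S covariance is set to zero.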


Using different choices of specific-factor variance ratio, G-S factor
correlation, and variable reliability, a set of covariance matrices for the ten y
variables was created and analyzed by a one-factor model. The values were
chosen to be close to those seen in the NAEP analyses: the MIMIC-
estimated grade 8 and 12 specific-factor variance ratios typically ranged
from 0.1 to ...; grade 12 factor correlations for the general factor were 0.19
with the specific factor of Geometry (multiple-choice) and ... with the
specific factor of Algebra (multiple-choice); a typical value for the testlet
reliability was around 0.4, while in Rock (1991) ... were used, given that
more items per testlet ... (taking the square root of each of
the three reliability values given in Table A.1 shows that they correspond to
one-factor standardized loadings of approximately 0.9, 0.8, and 0.7). The
parameter values chosen for Table A.1 give a 0.85-0.97 range for
factor correlation values (using equation 5), which is in line with the Rock
(1991) findings for the five content areas of the 1990 NAEP math data as
well as the corresponding results for the 1992 data given in this report.
Table A.1 gives the chi-square values of fit for the misspecified one-factor
model when analyzing a sample of n=500. The model has 35 df. The
G-S factor correlation varies, but for simplicity the specific-factor variance
ratio given in Table A.1 uses formula (3) with the G-S covariance set to zero.


Insert Table A.1.

It is seen that several combinations of parameter values give an acceptable
fit to the incorrect one-factor model, implying that the power to reject the
model is low. This occurs for low specific-factor variance ratio, low G-S
factor correlation, and low variable reliability. One such case, which attempts
to use typical parameter values based on the NAEP analyses, has a specific-
factor variance ratio of 0.2, G-S factor correlation of 0.2, and reliability of
0.5. The chi-square value is 24.71 in this case (p=0.902). The chi-square
values are linear in the sample size so that with a sample of 1,000, a value
twice as large would be obtained. Looking up the 5% critical value for 35
df (approximately 49), one can also calculate that in this case a sample size
of 992 would be required to reject the one-factor model at the 5% level.
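
The sample-size figure follows from the stated linearity of the chi-square value in the sample size. A minimal sketch of that calculation, using only the numbers quoted in the text (the helper name `required_n` is an illustrative assumption, and the proportional scaling is the approximation described above rather than an exact power calculation):

```python
import math


def required_n(chi2_at_n0, n0, critical_value):
    """Smallest sample size at which a chi-square value that scales linearly
    with n reaches the critical value."""
    return math.ceil(n0 * critical_value / chi2_at_n0)


# Chi-square of 24.71 at n = 500, with the approximate 5% critical value of 49
# for 35 df quoted in the text.
print(required_n(24.71, 500, 49.0))  # 992
```

A tabled critical value closer to 49.8 would give a slightly larger required sample size, so the result is sensitive to how the critical value is rounded.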

For this case, the correlation between the reliable parts of the two types of
content variables is 0.91, i.e. a two-factor simple-structure confirmatory
factor analysis model would have a factor correlation of 0.91 (this is
independent of the reliability). Had a two-factor model been fitted to these
data, such a high value is likely to also lead an investigator to maintain the
one-factor model. The corresponding factor correlation for a specific-factor
variance ratio of 0.1 is 0.96.


Table A.1. Chi-square test values for misspecified one-factor model (35 df, n=500)


Reliability of y1 to y6 = 0.80

V(G) = 0.70   V(S) = 0.30   V(e1) = 0.175   V(e7) = 0.32   V(S)/[V(G)+V(S)] = 0.30

r(G,S)          0.1      0.2      0.3      0.4      0.5
Cov(G,S)        0.05     0.09     0.14     0.18     0.23
Rel(y7-y10)     0.77     0.79     0.80     0.81     0.82
2-Fac Corr      0.85     0.87     0.89     0.90     0.92
Chi-sq        435.16   404.24   364.66   320.21   261.93
prob           0.000    0.000    0.000    0.000    0.000

V(G) = 0.80   V(S) = 0.20   V(e1) = 0.20   V(e7) = 0.40   V(S)/[V(G)+V(S)] = 0.20

r(G,S)          0.1      0.2      0.3      0.4      0.5
Cov(G,S)        0.04     0.08     0.12     0.16     0.20
Rel(y7-y10)     0.73     0.74     0.76     0.77     0.78
2-Fac Corr      0.90     0.91     0.92     0.93     0.94
Chi-sq        197.31   183.61   164.91   141.87   115.42
prob           0.000    0.000    0.000    0.000    0.000

V(G) = 0.88   V(S) = 0.10   V(e1) = 0.22   V(e7) = 0.30   V(S)/[V(G)+V(S)] = 0.10

r(G,S)          0.1      0.2      0.3      0.4      0.5
Cov(G,S)        0.03     0.06     0.09     0.12     0.15
Rel(y7-y10)     0.78     0.79     0.79     0.80     0.81
2-Fac Corr      0.95     0.96     0.96     0.96     0.97
Chi-sq         92.92    88.81    77.59    64.49    54.12
prob           0.000    0.000    0.000    0.002    0.021


Reliability of y1 to y6 = 0.65

V(G) = 0.70   V(S) = 0.30   V(e1) = 0.38   V(e7) = 0.70   V(S)/[V(G)+V(S)] = 0.30

r(G,S)          0.1      0.2      0.3      0.4      0.5
Cov(G,S)        0.05     0.09     0.14     0.18     0.23
Rel(y7-y10)     0.61     0.63     0.65     0.66     0.68
2-Fac Corr      0.85     0.87     0.89     0.90     0.92
Chi-sq        150.46   136.67   120.06   102.42    80.67
prob           0.000    0.000    0.000    0.000    0.000

V(G) = 0.80   V(S) = 0.20   V(e1) = 0.43   V(e7) = 0.70   V(S)/[V(G)+V(S)] = 0.20


Figure 1. Conditional representation of multidimensional

Figure 2. Path diagram for MIMIC model

Figure 3. Grade 8 gender comparison for the Measurement-specific factor
in multiple-choice format (NAEP '92 Grade 8, Female vs Male)

Figure 6. Grade 12 Asian-White comparisons for Geometry
in multiple-choice format

Figure 7. Grade 12 Asian-White comparisons for Algebra
in multiple-choice format

Figure 8. Grade 12 Black-White comparison for Data Analysis & Statistics
in multiple-choice format

Figure 9. Grade 12 Black-White comparison for Algebra
in multiple-choice format

Figure 10. Grade 12 Black-Asian comparison for Geometry
in multiple-choice format

Figure 11. Grade 12 Black-Asian comparison for Geometry
in constructed-response format

Figure 12. Grade 12 Black-Asian comparison for Algebra
in multiple-choice format

Figure 13. Grade 12 Hispanic-Black comparison for Data Analysis & Statistics
in multiple-choice format

Figure 14. Grade 12 Hispanic-Black comparison for Geometry
in constructed-response format

Figure 15. Grade 12 Hispanic-Asian comparison for Geometry
in multiple-choice format

Figure 16. Grade 12 Hispanic-Asian comparison for Algebra
in multiple-choice format

Figure 17. Grade 12 Hispanic-Asian comparison for Data Analysis & Statistics
in multiple-choice format

Figure 18. Grade 8 Asian-White comparison for the Measurement-specific factor
in multiple-choice format

Figure 19. Grade 8 Asian-White comparison for Data Analysis & Statistics
in multiple-choice format


Table 1. Item content and format mix
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» er|c 


g T^QC 12 

rormit 

MuiUole choice 


Number oi items 


Content 


% ot total 
% of content 
% of format 


SunXtOp Meaiurement 


Geometry 


Data analysis 


Algebra 


Short constructed responsa 

NumbflT oi items 


% of total 
% of content 
% of format 


29 

16.20% 

25.00% 

65.91% 

15 

8.38% 

26.32% 

34.09% 


Extended constructed response 

Numbs ot items 


^ of total 
of content 
of format 



18 

10.06% 

15.52% 

64.29% 

10 

5.59% 

17.54% 

35.71% 

0 

0 . 00 % 

0 . 00 % 

0.00% 


20 

11.17% 

17.24% 

64.52% 

10 

5.59% 

17.54% 

32.26% 

1 

0.56% 

16.67% 

3.23% 


17 

9.50% 

14.66% 

58.62% 

1 1 
6.15% 
19.30% 
37.93% 


0.56% 

16.67% 

3.45% 


32 

17.86% 

27.59% 

68.09% 

11 

6.15% 

19.30% 

23.40% 

4 

2.23% 

66.67% 

8.51% 


Total 


Number of items 


Total 


116 

100 . 00 % 

64.60% 

57 

100 . 00 % 

31.64% 


100 . 00 % 

3.35% 

179 

100 . 00 % 

100 . 00 % 



Format 

VtuUlpla chotoa 


{^iBibsof items 


% cf total 
%el content 
% of format 


Short eonstiuetod leaponse 

items 


% of total 
%of«ontmt 
%of format 


Extmdod eonstfttotsd wsponss 

Isiumbs of items 


% of total 
%of content 
«of format 


ToUl 


items 


%ofeontcnt 
% of format 


41 

22.40% 

34.75% 

70.69% 

15 

8 . 20 % 

25.42% 

25.66% 

2 

1.09% 

33.33% 

3.46% 

56 

31.69% 

100 . 00 % 


19 

10.36% 

16.10% 

50.36% 

12 

6.56% 

20.34% 

37.50% 

1 

0.55% 

16j57% 

3.13% 

32 

17.49% 

100 . 00 % 


20 

10.03% 

16.05% 

55.56% 

15 

8 . 20 % 

25.42% 

41.67% 


0.55% 

16.57% 

2.76% 
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10.67% 

100 . 00 % 
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oao% 

14.41% 

60.71% 
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5.46% 

16.05% 

35.71% 

1 

0.55% 

15.57% 

3.57% 

26 

15.30% 

100 . 00 % 


21 

11.46% 

17J0% 

72Jl1% 

7 

3.63% 

11 . 66 % 

24.14% 

1 

0.59% 

16J57% 

3.46% 

29 

1SJ6% 

100 . 00 % 


116 

100 . 00 % 

64.46% 

50 

100 . 00 % 

32.24% 

6 

100X0% 

3.29% 
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100 . 00 % 
100 . 00 % 
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Tables. Average percentage 

Factor 


contribution of specific factors to reliable testl et variation 

T-vaiue % Contribution 


Variance 


NAEP '92 grade 12

1. General
2. M-Measurement
3. M-Geometry
4. M-Data Analysis & Statistics
5. M-Algebra
6. C-Numbers & Operations
7. C-Geometry
8. C-Data Analysis & Statistics
9. C-Algebra

NAEP '92 grade 8

1. General
2. M-Measurement
3. M-Geometry
4. M-Data Analysis & Statistics
5. M-Algebra
6. C-Numbers & Operations
7. C-Measurement
8. C-Geometry
9. C-Data Analysis & Statistics


0.09 

0.00 

0.05 

0.04 

0.06 

0.00 

0.03 

O.IO 

0.00 


0.84 

0.10 

0.10 

0.06 

0i)2 

0D4 

0.03 

025 

OJOO 


n.oo 

2.45 

i.n 

4.07 

0.48 

2.74 


80.40 

1027 

17.40 
1327 

523 

1920 


2122 

79J05 

423 

14J8 

326 

2347 

2.13 

1144 

0.69 

•• 

128 

725 

0.42 

— 

8.49 

2S23 


M=Multiple choice
C=Constructed response


Table 4. Background variables used in the structural model (NAEP '92)

                                                  % in Grade 8   % in Grade 12
Sample size                                               8963            8705

1. Gender
   *1 Male                                                  51              49
    2 Female                                                49              51

2. Ethnicity
   *1 White                                                 67              69
    2 Black                                                 16              17
    3 Hispanic                                              14              10
    4 Asian                                                  3               4

3. Parents' Education (Student Reported)
    1 Didn't Finish High School                              9               8
    2 Grad From High School                                 25              22
    3 Some Ed After High School                             20              26
    4 Grad From College                                     47              44

4. Type Of Community
    1 Extreme Rural                                          8              11
    2 Disadvantaged Urban                                   10              13
    3 Advantaged Urban                                      11              12
   *4 Other (Non-Extreme)                                   71              64

5. School Type
   *1 Public School                                         79              80
    2 Private School                                         8               7
    3 Catholic School                                       13              13

6. Algebra (Course Taking)
    1 Pre-Algebra/Algebra                                   44
   *2 No Algebra/Other                                      56

7. Alg-Calc (Course Taking)
   *1 Pre-Algebra/1st-Year Algebra/Not Studied                              44
    2 2nd/3rd-Year Algebra                                                  52
    3 Calculus                                                               4

8. Geom-Trig (Course Taking)
   *1 Not Studied                                                           26
    2 Geometry                                                              56
    3 Trigonometry                                                          18

9. School Program (Grade 12 values 22, 26, 48; assignment to categories not recoverable)
   *1 General
    2 Academic/College Prep
    3 Vocational/Technical
    4 Other

Categories in the background variables are all dummy coded except for Parents'
Education. For dummy-coded variables, effects are interpreted as the category in
question compared to the base category (marked *) of the variable.


Table 7. Coefficients (t-values) from the structural model (NAEP '92 grade 12)



Female 


Ethnicity 


Black 


Hispanic 


Asian 


Parents Ed. 


-0.705 

(16.12) 

-0.402 

( 1006 ) 

0.015 

(028) 


0.107 

(823) 


0.275 
(3.00) 

0.673 

(2.89) 


-0.006 

(0J2) 


-0.977 
0.050 

(020) 

-0.425 

-027) 


OjOSO 

(074) 


0.608 

(523) 

0302 
1.099 

(579) 


0.025 

(021) 


-0.275 

(025) 

0.711 

(223) 

0.734 

(1.47) 


0.087 

(030) 


-0288 
-a 39) 

•0362 
-a 35) 

-0337 

-a3S) 


•0.040 

-<034) 


TOC 


Rural 


Disadv-Urban 


Adv-Urban 


0.191 

(553) 

-0.149 

-(454) 

•0354 

•<133) 


0.076 

(055) 

0.072 

(0.48) 

•0226 

•031) 


0.021 

(0.13) 

0547 

( 230 ) 

0318 

039) 


0.197 

(156) 

O.IOS 

(037) 

■a054 

•<03S) 


J3.048 

(0.12) 

-0.054 

( 021 ) 

0.175 

(036) 


0270 

0.48) 

0D99 

(030) 

0.181 

(036) 


School-Type 

Catholic 

Private 


-ai3S 0^ 

.(422) 

OJ097 -0^ 

<2J9) 


-ai44 -0-088 

.<074) 

.0.467 0-^^ 

-031) <325) 


-0.426 

-030) 

0284 

(071) 


Alg-Calc 

Algebra 

Calculus 


0394 

(1235) 

0.849 

(1236) 


•0.126 

(134) 

0259 

038) 


•0594 

(332) 

•0369 

•(234) 


0.136 

(133) 

0532 

(435) 


-0.604 

(2.42) 

0342 

(038) 


■OilTO 

■(030) 

4US9 

•038) 


Geom-Trig 

Geometry 


Trigonometry 


0363 

(1325) 

0595 

(1233) 


1.149 

(9j04) 

1218 

(721) 


•OJOIO 

•(0J03) 

•0237 

-031) 


OJ062 

(034) 

0.420 

(3j08) 


0J938 
0516 

(237) 


OJO09 

(032) 

-OJ068 

•4016) 


School-Program 

Academic 


Vocational 

Other 


0.422 

(1236) 

•0.076 

•032) 

OJ019 

(070) 


0J024 

(020) 


•0202 

•0.19) 


•0228 

-(031) 


0J049 

(0J4) 


0j06S •0312 

(030) -<337) 


0264 

(232) 


•aiss 

.(038) 


0.130 -0^ 

(052) -<®^ 


•0.155 

•035) 


0017 

(004) 



0.261 0^ 


■03S8 

•025) 

0054 

(014) 

-0044 

-<017) 


0.122 


Table 6. Estimated content factor correlations 


NAEP '92 Grade 12 

                 Num & Op   Measurement   Geometry   Data Analysis   Algebra 
Num & Op          1.000 
Measurement       1.000       1.000 
Geometry          0.983       0.983        1.000 
Data Analysis     0.969       0.969        0.948        1.000 
Algebra           0.990       0.980        0.976        0.953         1.000 


NAEP '92 Grade 8 

                 Num & Op   Measurement   Geometry   Data Analysis   Algebra 
Num & Op          1.000 
Measurement        .879       1.000 
Geometry           .945        .844        1.000 
Data Analysis      .985        .878         .943        1.000 
Algebra            .985        .877         .943         .982         1.000 

Table 8. Standardized coefficients (& t-values) from the structural model (NAEP '92 grade 8) 

General   M-Data   C-Numb   C-Geom 


Female 

Ethnicity 

Black 

Hispanic 

Asian 

Parents' Ed. 
TOC 

Rural 

Disadv-Urban 

Adv-Urban 

School-Type 

Catholic 

Private 

Algebra 

R Square 


0.048 

(2J1) 

-0.466 

■(638) 

-0258 

•(333) 

-0.130 

■(134) 

0292 

(223) 

-0514 

-(022) 

-0JBS\ 

-(24j6i) 

-0525 

<1576) 

0739 

(371) 

-0.404 

-f3.47J 

-0.127 

•am 

-0.405 

■(154) 

■OJ0O2 

■(036) 

-0587 

•(046) 

-0223 

■(030) 

-0.442 

-(257) 

-0.465 

■(279) 

-0581 

■(2J7) 

0.415 

(251) 

-0289 

-(1J9) 

-0.423 

•(159) 

4 

-0401 

•(354) 

-0.130 

-a23) 

-0258 • 

•(151) 

0.194 

(16.61) 

■oxns 

•(037) 

-0517 

0.013 

(032) 

-0.154 

•(231) 

•0526 

-(020) 






4 

•0SD2 

■(OjSO) 

•0287 

<7J1) 

0304 

(8M) 

■ojoa 

■(034) 

-0507 

•032) 

-0.117 

•(030) 

0541 

(037) 

0213 

0202 

041) 

0537 

(030) 

-0213 

■(134) 

-0.155 

-(OJS) 

0A26 

ajj) 

0510 

(05S) 

-0549 

<136) 

-0530 

•(023) 

-0533 
-(027) 1 

•0231 

<139) 

0.129 

(4M) 

ojoeo 

as9) 

-0252 

•(230) 

0.105 

(0723 

-0.102 

-fOL79) 

05S8 

(055) 

0566 

(0.40) 

0.160 

(026) 

0258 

a27) 

ai3i 

(050) 

•0539 i 
-(058) 

0598 

(026) 

0548 

(2Z69) 

-0.103 

•(134) 

-ai67 

-ai2) 

0.159 

aJ6) 

-0252 

-(123) 

•0188 4 
-(2553 

0381 

0302 

0535 

0584 

0366 

0535 